XPEDIA: XML ProcEssing for Data IntegrAtion

Authors

  • Manish Bhide
  • Manoj K. Agarwal
  • Amir Bar-Or
  • Sriram Padmanabhan
  • Srinivas Mittapalli
  • Girish Venkatachaliah
Abstract

Data integration engines increasingly need to provide sophisticated processing options for XML data. In the past, it was adequate for these engines to support basic shredding and XML generation capabilities. However, with the steady growth of XML in applications and databases, integration platforms need to provide more direct operations on XML and to improve the scalability and efficiency of those operations. In this paper, we describe a robust and comprehensive framework for performing Extract-Transform-Load (ETL) of XML. This includes (i) a full computational model and engine capabilities to perform these operations in an ETL flow, (ii) an approach to pushing down XML operations into a database engine capable of supporting XML processing, and (iii) methods to apply partitioning techniques to provide scalable, parallel processing for large XML documents. We describe experimental results showing the effectiveness of these techniques.
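To make the terms "shredding" and "partitioning" concrete, the following is a minimal Python sketch and not the framework described in the paper. It flattens repeating XML elements into relational-style rows and processes round-robin partitions of those elements concurrently. The sample document and the names `shred`, `partition`, and `shred_partitioned` are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample document; not taken from the paper.
SAMPLE = """<orders>
  <order id="1"><customer>Ann</customer><total>10.50</total></order>
  <order id="2"><customer>Bob</customer><total>7.25</total></order>
  <order id="3"><customer>Cy</customer><total>3.00</total></order>
</orders>"""


def shred(order):
    """Flatten one <order> element into a relational-style row (tuple)."""
    return (
        int(order.get("id")),
        order.findtext("customer"),
        float(order.findtext("total")),
    )


def partition(root, n):
    """Split the repeating child elements of root into n round-robin partitions."""
    children = list(root)
    return [children[i::n] for i in range(n)]


def shred_partitioned(xml_text, n=2):
    """Shred each partition in its own worker, then merge the rows."""
    root = ET.fromstring(xml_text)
    parts = partition(root, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = pool.map(lambda part: [shred(o) for o in part], parts)
    # Sort so the merged output does not depend on partition order.
    return sorted(row for part in results for row in part)


rows = shred_partitioned(SAMPLE)
```

This sketch parses the whole document in memory before splitting it, which would not scale to the large documents the abstract targets; it only illustrates the idea of dividing repeating elements among parallel workers.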

Similar resources

Grid Data Integration Based on Schema Mapping

Data integration is the flexible and managed federation, analysis, and processing of data from different distributed sources. Data integration is a key issue for exploiting the availability of large, heterogeneous, distributed and highly dynamic data volumes on Grids. This paper presents a framework for integrating heterogeneous XML data sources distributed among the nodes of a Grid. We present...

Querying Semi-structured Data with Mutual Exclusion

Data analytics applications, content-based collaborative platforms and office applications require the integration and management of current and historical data from heterogeneous sources. XML is a standard data format for information. Thanks to its semi-structured-ness, it is a good candidate data model for the integration and management of heterogeneous content. However, the management of his...

Converting XML Data To UML Diagrams For Conceptual Data Integration

The demand for data integration is rapidly becoming larger as more and more information sources appear in modern enterprises. In many situations a logical (rather than physical) integration of data is preferable since some data is inherently not suited for storing in a physically integrated data warehouse. Previous web-based data integration efforts have focused almost exclusively on the logica...

A Model-Based Software Architecture for XML Data and Metadata Integration in Data Warehouse Systems

The demand for data integration is rapidly becoming larger as more and more information sources appear in modern enterprises. Extensible Mark-up Language is fast becoming the new standard for data representation and exchange on the World Wide Web, e.g., in B2B e-commerce, making it necessary for data analysis tools to handle XML data as well as traditional data formats. This paper presents arch...

Towards Linked Data based Enterprise Information Integration

Data integration in large enterprises is a crucial but at the same time costly, long lasting and challenging problem. In the last decade, the prevalent data integration approaches were primarily based on XML, Web Services and Service Oriented Architectures (SOA). We argue that classic SOA architectures may be well-suited for transaction processing, however more efficient technologies can be emp...

Journal title:
  • PVLDB

Volume 2

Publication date: 2009